Model Aggregation


Low Precision Local Training is Enough for Federated Learning

Neural Information Processing Systems

Federated Learning (FL) is a prevalent machine learning paradigm designed to address challenges posed by heterogeneous client data while preserving data privacy. Unlike distributed training, it typically orchestrates resource-constrained edge devices that communicate with a central server over a low-bandwidth network. This motivates the development of more computation- and communication-efficient training algorithms. In this paper, we propose an efficient FL paradigm in which the local models on the clients are trained with low-precision operations and communicated to the server in a low-precision format, while only the model aggregation at the server is performed with high-precision computation. Surprisingly, we find that high-precision models can be recovered from the low-precision local models with proper aggregation at the server.
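
As a concrete, deliberately minimal sketch of this division of labor (not the authors' implementation), the snippet below has clients upload int8-quantized weights while the server dequantizes and averages in float32; the symmetric per-tensor quantizer and the small perturbation standing in for local training are assumptions.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization (an illustrative scheme)."""
    scale = max(float(np.abs(w).max()) / 127.0, 1e-12)
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

def server_aggregate(payloads):
    """High-precision (float32) averaging of low-precision client uploads."""
    recovered = [dequantize(q, s) for q, s in payloads]
    return np.mean(recovered, axis=0)

# Toy round: four clients perturb a shared model and upload int8 weights.
rng = np.random.default_rng(0)
global_w = rng.normal(size=1000).astype(np.float32)
payloads = [quantize_int8(global_w + 0.01 * rng.normal(size=1000).astype(np.float32))
            for _ in range(4)]
new_global = server_aggregate(payloads)
print(float(np.abs(new_global - global_w).mean()))  # small: quantization noise largely averages out
```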


MAR-FL: A Communication Efficient Peer-to-Peer Federated Learning System

Mulitze, Felix, Woisetschläger, Herbert, Jacobsen, Hans Arno

arXiv.org Artificial Intelligence

The convergence of next-generation wireless systems and distributed Machine Learning (ML) demands Federated Learning (FL) methods that remain efficient and robust with wirelessly connected peers and under network churn. Peer-to-peer (P2P) FL removes the bottleneck of a central coordinator, but existing approaches suffer from excessive communication complexity, limiting their scalability in practice. We introduce MAR-FL, a novel P2P FL system that leverages iterative group-based aggregation to substantially reduce communication overhead while retaining resilience to churn. MAR-FL achieves communication costs that scale as O(N log N), in contrast with the O(N^2) complexity of existing baselines, and thereby remains effective as the number of peers in an aggregation round grows. The system is robust to unreliable FL clients and can integrate private computing.
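
The complexity argument can be illustrated with a toy recursion. The fixed group size of four and plain averaging below are assumptions; MAR-FL's actual protocol adds churn-resilience mechanics this sketch omits.

```python
import numpy as np

def group_aggregate(models, group_size=4):
    """Iterative group-based averaging: each group member uploads once per
    level, giving ~N messages per level over ~log_k(N) levels, i.e.
    O(N log N) total, versus O(N^2) for naive all-to-all exchange."""
    msgs = 0
    while len(models) > 1:
        groups = [models[i:i + group_size] for i in range(0, len(models), group_size)]
        msgs += sum(len(g) for g in groups)   # one upload per group member
        models = [np.mean(g, axis=0) for g in groups]
    return models[0], msgs

rng = np.random.default_rng(1)
peers = [rng.normal(size=10) for _ in range(64)]
agg, cost = group_aggregate(peers)
print(cost)  # 84 messages for 64 peers, vs 64 * 63 = 4032 for all-to-all
```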


Graph Theory Meets Federated Learning over Satellite Constellations: Spanning Aggregations, Network Formation, and Performance Optimization

Nadimi, Fardis, Abdisarabshali, Payam, Chakareski, Jacob, Mastronarde, Nicholas, Hosseinalipour, Seyyedali

arXiv.org Artificial Intelligence

In this work, we introduce Fed-Span: federated learning with spanning aggregation over low Earth orbit (LEO) satellite constellations. Fed-Span aims to address critical challenges inherent to distributed learning in dynamic satellite networks, including intermittent satellite connectivity, heterogeneous computational capabilities of satellites, and time-varying satellite datasets. At its core, Fed-Span leverages minimum spanning tree (MST) and minimum spanning forest (MSF) topologies to introduce spanning model aggregation and dispatching processes for distributed learning. To formalize Fed-Span, we offer a fresh perspective on MST/MSF topologies by formulating them through a set of continuous constraint representations (CCRs), thereby integrating these topologies into a distributed learning framework for satellite networks. Using these CCRs, we obtain the energy consumption and latency of operations in Fed-Span. Moreover, we derive novel convergence bounds for Fed-Span, accommodating its key system characteristics and degrees of freedom (i.e., tunable parameters). Finally, we propose a comprehensive optimization problem that jointly minimizes model prediction loss, energy consumption, and latency of Fed-Span. We show that this problem is NP-hard and develop a systematic approach to transform it into a geometric programming formulation, solved via successive convex optimization with performance guarantees. Through evaluations on real-world datasets, we demonstrate that Fed-Span outperforms existing methods, with faster model convergence, greater energy efficiency, and reduced latency.
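
A minimal sketch of spanning aggregation follows, assuming Euclidean link costs and unweighted averaging; the paper's CCR formulation and energy/latency models are not reproduced here.

```python
import numpy as np

def prim_mst(cost):
    """Prim's algorithm over a symmetric link-cost matrix; returns parent[]."""
    n = len(cost)
    in_tree = [False] * n
    parent = [-1] * n
    key = [float("inf")] * n
    key[0] = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: key[i])
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v] and cost[u][v] < key[v]:
                key[v] = cost[u][v]
                parent[v] = u
    return parent

def spanning_aggregate(models, parent):
    """Merge models leaf-to-root along the tree: n - 1 transmissions in total,
    each satellite forwarding one partially aggregated model to its parent."""
    n = len(models)
    children = [[] for _ in range(n)]
    for v, p in enumerate(parent):
        if p >= 0:
            children[p].append(v)
    def merge(u):
        total, cnt = models[u].copy(), 1
        for c in children[u]:
            t, k = merge(c)
            total, cnt = total + t, cnt + k
        return total, cnt
    total, cnt = merge(0)   # node 0 acts as the root/sink
    return total / cnt

rng = np.random.default_rng(2)
pos = rng.normal(size=(8, 2))                                # satellite positions
cost = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)  # link cost ~ distance
models = [rng.normal(size=5) for _ in range(8)]
print(spanning_aggregate(models, prim_mst(cost)))
```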


FedQS: Optimizing Gradient and Model Aggregation for Semi-Asynchronous Federated Learning

Li, Yunbo, Gui, Jiaping, Deng, Zhihang, Meng, Fanchao, Wu, Yue

arXiv.org Artificial Intelligence

Federated learning (FL) enables collaborative model training across multiple parties without sharing raw data, with semi-asynchronous FL (SAFL) emerging as a balanced approach between synchronous and asynchronous FL. However, SAFL faces significant challenges in optimizing both gradient-based (e.g., FedSGD) and model-based (e.g., FedAvg) aggregation strategies, which exhibit distinct trade-offs in accuracy, convergence speed, and stability. While gradient aggregation achieves faster convergence and higher accuracy, it suffers from pronounced fluctuations, whereas model aggregation offers greater stability but slower convergence and suboptimal accuracy. This paper presents FedQS, the first framework to theoretically analyze and address these disparities in SAFL. FedQS introduces a divide-and-conquer strategy to handle client heterogeneity by classifying clients into four distinct types and adaptively optimizing their local training based on data distribution characteristics and available computational resources. Extensive experiments on computer vision, natural language processing, and real-world tasks demonstrate that FedQS achieves the highest accuracy, attains the lowest loss, and ranks among the fastest in convergence speed, outperforming state-of-the-art baselines. Our work bridges the gap between aggregation strategies in SAFL, offering a unified solution for stable, accurate, and efficient federated learning. The code and datasets are available at https://github.com/bkjod/FedQS_.
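
The trade-off between the two aggregation styles can be made concrete with a sketch; the 1/(1 + staleness) discount below is an illustrative choice, not FedQS's rule, and its four-type client classification is omitted.

```python
import numpy as np

def fedsgd_step(global_w, client_grads, lr=0.1):
    """Gradient aggregation: average raw gradients, take one server step.
    Converges fast per round but is sensitive to stale/noisy gradients,
    matching the fluctuations attributed to FedSGD-style aggregation."""
    return global_w - lr * np.mean(client_grads, axis=0)

def fedavg_step(client_weights, staleness):
    """Model aggregation: average locally trained weights; here with a
    simple staleness discount (an assumption, not FedQS's actual rule)."""
    coeff = np.array([1.0 / (1.0 + s) for s in staleness])
    coeff /= coeff.sum()
    return np.tensordot(coeff, np.stack(client_weights), axes=1)

rng = np.random.default_rng(3)
w = rng.normal(size=20)
grads = [rng.normal(size=20) for _ in range(5)]
locals_ = [w - 0.1 * g for g in grads]  # one local step per client
print(fedsgd_step(w, grads).shape, fedavg_step(locals_, [0, 1, 0, 2, 1]).shape)
```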


Chisme: Fully Decentralized Differentiated Deep Learning for IoT Intelligence

Kuttivelil, Harikrishna, Obraczka, Katia

arXiv.org Artificial Intelligence

As end-user device capabilities increase and demand for intelligent services at the Internet's edge rises, distributed learning has emerged as a key enabling technology. Existing approaches like federated learning (FL) and decentralized FL (DFL) enable distributed learning among clients, while gossip learning (GL) approaches have emerged to address the potential challenges in resource-constrained, connectivity-challenged, infrastructure-less environments. However, most distributed learning approaches assume largely homogeneous data distributions and may not consider or exploit the heterogeneity of clients and their underlying data distributions. This paper introduces Chisme, a novel fully decentralized distributed learning algorithm designed to address the challenges of implementing robust intelligence in network edge contexts characterized by heterogeneous data distributions, episodic connectivity, and sparse network infrastructure. Chisme leverages cosine similarity-based data affinity heuristics, calculated from received model exchanges, to determine how much influence received models have when merging into the local model. By doing so, it grants stronger merging influence between clients with more similar model learning progressions, enabling clients to strategically balance between broader collaboration to build more general knowledge and more selective collaboration to build specific knowledge. We evaluate Chisme against contemporary approaches using image recognition and time-series prediction scenarios while considering different network connectivity conditions, representative of real-world distributed intelligent systems. Our experiments demonstrate that Chisme outperforms state-of-the-art edge intelligence approaches in almost every case: clients using Chisme exhibit faster training convergence, lower final loss after training, and lower performance disparity between clients.
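
A sketch of the affinity-weighted merge follows, assuming raw weight vectors as the similarity signal and a fixed mixing rate alpha; Chisme's actual affinity bookkeeping may differ.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def affinity_merge(local_w, received, alpha=0.5):
    """Merge received peer models into the local model, weighting each by
    cosine affinity to the local model, so similar learning progressions
    exert more influence."""
    if not received:
        return local_w
    sims = np.array([max(cosine(local_w, r), 0.0) for r in received])
    if sims.sum() == 0.0:
        return local_w  # nothing similar enough to merge
    peer_avg = np.tensordot(sims / sims.sum(), np.stack(received), axes=1)
    return (1.0 - alpha) * local_w + alpha * peer_avg

rng = np.random.default_rng(4)
w = rng.normal(size=30)
peers = [w + 0.05 * rng.normal(size=30), rng.normal(size=30)]  # one similar, one not
print(affinity_merge(w, peers))
```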


Author Response (FedDF)

Neural Information Processing Systems

We thank all reviewers for their time and their valuable feedback. We will add corrections/clarifications as suggested. We would like to emphasize our contribution, as summarized by R5's thorough review: what is novel about this paper "is formulating a robust, efficient training scheme with extensive results and analysis, which is significant enough." Our proposed FedDF is not the mentioned engineering solution. Mutually beneficial information can be shared across architectures in each FL round. GAN training is not involved in any stage of FL and cannot steal clients' data: data generation is done by the (frozen) generator before the FL training, by performing inference on random noise.


Lightweight and Robust Federated Data Valuation

Tang, Guojun, Zhou, Jiayu, Mamun, Mohammad, Drew, Steve

arXiv.org Artificial Intelligence

Federated learning (FL) faces persistent robustness challenges due to non-IID data distributions and adversarial client behavior. A promising mitigation strategy is contribution evaluation, which enables adaptive aggregation by quantifying each client's utility to the global model. However, state-of-the-art Shapley-value (SV)-based approaches incur high computational overhead due to repeated model reweighting and inference, which limits their scalability. We propose FedIF, a novel FL aggregation framework that leverages trajectory-based influence estimation to efficiently compute client contributions. FedIF introduces normalized and smoothed influence scores computed from lightweight gradient operations on client updates and a public validation set. Theoretical analysis demonstrates that FedIF yields a tighter bound on the one-step global loss change under noisy conditions. Extensive experiments on CIFAR-10 and Fashion-MNIST show that FedIF achieves robustness comparable to or exceeding that of SV-based methods in the presence of label noise, gradient noise, and adversarial samples, while reducing aggregation overhead by up to 450x. Ablation studies confirm the effectiveness of FedIF's design choices, including local weight normalization and influence smoothing. Our results establish FedIF as a practical, theoretically grounded, and scalable alternative to Shapley-value-based approaches for efficient and robust FL in real-world deployments.
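
A sketch of trajectory-based influence scoring follows, where a client's influence is a first-order estimate of the validation-loss reduction its update would yield; the clipping and EMA smoothing constants below are assumptions about the paper's design.

```python
import numpy as np

def fedif_weights(val_grad, client_updates, prev_scores=None, beta=0.9):
    """Influence of client i ~ -<grad of validation loss, normalized update>:
    if the update points against the validation gradient, applying it should
    reduce validation loss, so the client earns a positive score."""
    raw = np.array([float(-(val_grad @ (u / (np.linalg.norm(u) + 1e-12))))
                    for u in client_updates])  # local weight normalization
    raw = np.clip(raw, 0.0, None)              # ignore harmful updates
    scores = raw if prev_scores is None else beta * prev_scores + (1 - beta) * raw
    total = scores.sum()
    weights = scores / total if total > 0 else np.full(len(scores), 1.0 / len(scores))
    return weights, scores  # weights for aggregation, scores to carry over

rng = np.random.default_rng(5)
g_val = rng.normal(size=50)
updates = [-g_val + 0.1 * rng.normal(size=50),  # helpful client
           rng.normal(size=50),                 # noisy client
           g_val]                               # adversarial client
w, s = fedif_weights(g_val, updates)
print(np.round(w, 3))                           # the helpful client dominates
```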


Accelerating Privacy-Preserving Federated Learning in Large-Scale LEO Satellite Systems

Guo, Binquan, Cao, Junteng, Siew, Marie, Chen, Binbin, Quek, Tony Q. S., Han, Zhu

arXiv.org Artificial Intelligence

Large-scale low-Earth-orbit (LEO) satellite systems are increasingly valued for their ability to enable rapid and wide-area data exchange, thereby facilitating the collaborative training of artificial intelligence (AI) models across geographically distributed regions. Due to privacy concerns and regulatory constraints, raw data collected at remote clients cannot be centrally aggregated, posing a major obstacle to traditional AI training methods. Federated learning offers a privacy-preserving alternative by training local models on distributed devices and exchanging only model parameters. However, the dynamic topology and limited bandwidth of satellite systems hinder timely parameter aggregation and distribution, resulting in prolonged training times. To address this challenge, we investigate the problem of scheduling federated learning over satellite networks and identify key bottlenecks that impact the overall duration of each training round. We propose a discrete temporal graph-based on-demand scheduling framework that dynamically allocates communication resources to accelerate federated learning. Simulation results demonstrate that the proposed approach achieves significant performance gains over traditional statistical multiplexing-based model exchange strategies, reducing overall round times by 14.20% to 41.48%. Moreover, the acceleration effect becomes more pronounced for larger models and higher numbers of clients, highlighting the scalability of the proposed approach.
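
At the routing level, scheduling over a discrete temporal graph can be sketched as an earliest-arrival search over a contact plan. The bidirectional contacts and the toy plan below are assumptions, and the paper's resource-allocation layer is not modeled.

```python
import heapq

def earliest_arrival(contacts, src, dst, t0=0):
    """Earliest-arrival search on a discrete temporal graph. `contacts` is a
    list of (u, v, start_slot, transmit_slots); links are treated as
    bidirectional here (real inter-satellite contact plans are directed)."""
    best = {src: t0}
    pq = [(t0, src)]
    while pq:
        t, node = heapq.heappop(pq)
        if node == dst:
            return t
        if t > best.get(node, float("inf")):
            continue
        for a, b, start, dur in contacts:
            for x, y in ((a, b), (b, a)):
                if x == node and start >= t:  # contact usable after arrival
                    arr = start + dur
                    if arr < best.get(y, float("inf")):
                        best[y] = arr
                        heapq.heappush(pq, (arr, y))
    return None  # model cannot be delivered with this contact plan

# Contact plan: which inter-satellite/ground links open at which slot.
plan = [("sat1", "sat2", 1, 2), ("sat2", "gs", 4, 1),
        ("sat1", "sat3", 0, 1), ("sat3", "gs", 8, 1)]
print(earliest_arrival(plan, "sat1", "gs"))  # 5: via sat2, not sat3
```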


Curriculum Guided Personalized Subgraph Federated Learning

Kang, Minku, Park, Hogun

arXiv.org Artificial Intelligence

Subgraph Federated Learning (FL) aims to train Graph Neural Networks (GNNs) across distributed private subgraphs, but it suffers from severe data heterogeneity. To mitigate data heterogeneity, weighted model aggregation personalizes each local GNN by assigning larger weights to parameters from clients with similar subgraph characteristics inferred from their current model states. However, the sparse and biased subgraphs often trigger rapid overfitting, causing the estimated client similarity matrix to stagnate or even collapse. As a result, aggregation loses effectiveness as clients reinforce their own biases instead of exploiting diverse knowledge otherwise available. To this end, we propose a novel personalized subgraph FL framework called Curriculum guided personalized sUbgraph Federated Learning (CUFL). On the client side, CUFL adopts Curriculum Learning (CL) that adaptively selects edges for training according to their reconstruction scores, exposing each GNN first to easier, generic cross-client substructures and only later to harder, client-specific ones. This paced exposure prevents early overfitting to biased patterns and enables gradual personalization. By regulating personalization, the curriculum also reshapes server aggregation from exchanging generic knowledge to propagating client-specific knowledge. Further, CUFL improves weighted aggregation by estimating client similarity using fine-grained structural indicators reconstructed on a random reference graph. Extensive experiments on six benchmark datasets confirm that CUFL achieves superior performance compared to relevant baselines. Code is available at https://github.com/Kang-Min-Ku/CUFL.git.
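
A sketch of the pacing idea follows, assuming a linear schedule and that higher reconstruction scores mark easier, more generic edges; CUFL's scores come from its reconstruction module, which is not reproduced here.

```python
import numpy as np

def curriculum_edge_mask(recon_scores, round_idx, total_rounds, start_frac=0.3):
    """Select edges easiest-first (highest reconstruction score) and grow the
    kept fraction linearly over rounds, so each GNN sees generic cross-client
    substructures before harder, client-specific ones."""
    frac = min(1.0, start_frac + (1.0 - start_frac) * round_idx / max(1, total_rounds - 1))
    k = max(1, int(frac * len(recon_scores)))
    keep = np.argsort(recon_scores)[::-1][:k]
    mask = np.zeros(len(recon_scores), dtype=bool)
    mask[keep] = True
    return mask  # apply to the client's edge list before local GNN training

rng = np.random.default_rng(6)
scores = rng.random(100)                           # one score per local edge
print(curriculum_edge_mask(scores, 0, 10).sum(),   # 30 easiest edges at round 0
      curriculum_edge_mask(scores, 9, 10).sum())   # all 100 edges by the final round
```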


Towards Heterogeneity-Aware and Energy-Efficient Topology Optimization for Decentralized Federated Learning in Edge Environment

Liu, Yuze, Zhang, Tiehua, Shen, Zhishu, Wu, Libing, Chen, Shiping, Jin, Jiong

arXiv.org Artificial Intelligence

Federated learning (FL) has emerged as a promising paradigm within edge computing (EC) systems, enabling numerous edge devices to collaboratively train artificial intelligence (AI) models while maintaining data privacy. To overcome the communication bottlenecks associated with centralized parameter servers, decentralized federated learning (DFL), which leverages peer-to-peer (P2P) communication, has been extensively explored in the research community. Although researchers have designed a variety of DFL approaches to ensure model convergence, the iterative learning process inevitably incurs considerable costs as model complexity and the number of participants grow. These costs are largely influenced by the dynamic changes of the topology in each training round, particularly its sparsity and connectivity conditions. Furthermore, the inherent resource heterogeneity of edge environments affects the energy efficiency of the learning process, while data heterogeneity degrades model performance. These factors pose significant challenges to the design of an effective DFL framework for EC systems. To this end, we propose Hat-DFed, a heterogeneity-aware and cost-effective DFL framework. In Hat-DFed, topology construction is formulated as a dual optimization problem, which is then proven to be NP-hard, with the goal of maximizing model performance while minimizing cumulative energy consumption in complex edge environments. To solve this problem, we design a two-phase algorithm that dynamically constructs optimal communication topologies while unbiasedly estimating their impact on both model performance and energy cost. Additionally, the algorithm incorporates an importance-aware model aggregation mechanism to mitigate performance degradation caused by data heterogeneity.
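
Two of the moving parts can be sketched independently: a greedy k-cheapest-links topology (a crude stand-in for the NP-hard dual optimization) and an importance-weighted merge with an abstract scoring function. Both are assumptions of this sketch, not Hat-DFed's actual algorithm.

```python
import numpy as np

def sparsify_topology(link_energy, k):
    """Keep each node's k cheapest links: a greedy stand-in for Hat-DFed's
    dual-objective (performance vs. energy) topology optimization."""
    n = len(link_energy)
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        neighbors = [j for j in np.argsort(link_energy[i]) if j != i][:k]
        adj[i, neighbors] = True
        adj[neighbors, i] = True  # keep the topology symmetric
    return adj

def importance_aware_merge(models, importances):
    """Average models weighted by an importance score (e.g., local data size
    or validation gain; the scoring function is left abstract here)."""
    w = np.asarray(importances, dtype=float)
    return np.tensordot(w / w.sum(), np.stack(models), axes=1)

rng = np.random.default_rng(7)
energy = rng.random((6, 6))
energy = (energy + energy.T) / 2  # symmetric link costs
adj = sparsify_topology(energy, k=2)
models = [rng.normal(size=8) for _ in range(6)]
neighbors_of_0 = np.flatnonzero(adj[0])
merged = importance_aware_merge([models[0]] + [models[j] for j in neighbors_of_0],
                                importances=[2.0] + [1.0] * len(neighbors_of_0))
print(merged)
```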